
Learning Simpler Language Models with the Differential State Framework


Abstract

Learning useful information across long time lags is a critical and difficult problem for temporal neural models in tasks such as language modeling. Existing architectures that address the issue are often complex and costly to train. The Differential State Framework (DSF) is a simple and high-performing design that unifies previously introduced gated neural models. DSF models maintain longer-term memory by learning to interpolate between a fast-changing data-driven representation and a slowly changing, implicitly stable state. This requires hardly any more parameters than a classical, simple recurrent network. Within the DSF framework, a new architecture is presented, the Delta-RNN. In language modeling at the word and character levels, the Delta-RNN outperforms popular complex architectures, such as the Long Short Term Memory (LSTM) and the Gated Recurrent Unit (GRU), and, when regularized, performs comparably to several state-of-the-art baselines. At the subword level, the Delta-RNN's performance is comparable to that of complex gated architectures.
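The interpolation idea described in the abstract can be made concrete with a short sketch. The following is a minimal, illustrative DSF-style recurrent cell in NumPy, not the paper's exact parameterization: the class name DeltaRNNCell and the parameter names W, V, b, and b_r are assumptions introduced here for illustration, and the gate deliberately reuses the input projection only to show why this design adds hardly any parameters over a simple recurrent network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DeltaRNNCell:
    """Illustrative DSF-style cell: the next state is a learned
    interpolation between a fast, data-driven proposal and the slowly
    changing previous state. A sketch, not the paper's exact equations."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        s = 1.0 / np.sqrt(hidden_size)
        self.W = rng.uniform(-s, s, (hidden_size, input_size))   # input projection
        self.V = rng.uniform(-s, s, (hidden_size, hidden_size))  # recurrent weights
        self.b = np.zeros(hidden_size)                           # proposal bias
        self.b_r = np.zeros(hidden_size)                         # gate bias

    def step(self, x, h_prev):
        # Fast-changing, data-driven proposal, as in a simple RNN.
        z = np.tanh(self.W @ x + self.V @ h_prev + self.b)
        # Data-dependent interpolation gate; where r is near 1 the old
        # state is carried forward, giving the slowly changing component.
        r = sigmoid(self.W @ x + self.b_r)
        return (1.0 - r) * z + r * h_prev

# Toy usage: run the cell over a one-hot sequence.
cell = DeltaRNNCell(input_size=8, hidden_size=16)
h = np.zeros(16)
for x in np.eye(8):
    h = cell.step(x, h)
```

In this sketch the gate reuses the same projection W as the proposal, so the only parameters beyond a classical Elman-style RNN are the gate bias b_r, which is the spirit of the abstract's claim that the DSF "requires hardly any more parameters than a classical, simple recurrent network."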
